Abstract

For a Gaussian measure on a separable Hilbert space with covariance operator $C$,
we show that the family of conditional measures associated with conditioning on a
closed subspace $S^\perp$ are Gaussian with covariance operator the short $\mathcal{S}(C)$ of the
operator $C$ to $S$. We provide two proofs. The first uses the theory of Gaussian Hilbert
spaces and a characterization of the shorted operator by Anderson and Trapp. The
second uses recent developments by Corach, Maestripieri and Stojanoff on the
relationship between the shorted operator and $C$-symmetric oblique projections onto
$S^\perp$. To obtain the assertion when such projections do not exist, we develop an
approximation result for the shorted operator by showing, for any positive operator
$A$, how to construct a sequence of approximating operators $A^n$ which possess
$A^n$-symmetric oblique projections onto $S^\perp$ such that the sequence of shorted operators
$\mathcal{S}(A^n)$ converges to $\mathcal{S}(A)$ in the weak operator topology. This result combined with
the martingale convergence of random variables associated with the corresponding
approximations $C^n$ establishes the main assertion in general. Moreover, it in turn
strengthens the approximation theorem for the shorted operator when the operator is
trace class; then the sequence of shorted operators $\mathcal{S}(A^n)$ converges to $\mathcal{S}(A)$ in trace
norm.


1 Introduction 

For a Gaussian measure $\mu$ with injective covariance operator $C$ on a direct sum of
finite dimensional Hilbert spaces $H = H_1 \oplus H_2$, the conditional measure associated
with conditioning on the value of $H_2$ can be computed in terms of the Schur
complement corresponding to the partitioning of the covariance matrix $C$, see Cottle [ ]
for a review. Evidently, the natural extension to infinite dimensions of the Schur
complement is the shorted operator, first discovered by Krein [22] and developed
in Anderson and Trapp [2] based on results on operator ranges of Douglas [13] and
Fillmore and Williams [16]. For related results, and a history, see Pekarev [30].
However, the connection between the shorted operator and the covariance operator of the
conditional Gaussian measure on an infinite dimensional Hilbert space appears yet
to be established. Indeed, Hairer, Stuart, Voss, and Wiberg [18, Lem. 4.3], see also
Stuart [35, Thm. 6.20], characterize the conditional measure through a measurable
extension result of Dalecky and Fomin [11, Thm. II.3.3] of an operator defined on
the Cameron-Martin reproducing kernel Hilbert space. For other representations,
see Mandelbaum [27], LaGatta [25], and Tarieladze and Vakhania's [38] extension
of the optimal linear approximation results of Lee and Wasilkowski [26] from finite
to infinite rank. Tarieladze [37] asserts that this latter result extends one in the
Information-Based Complexity of Traub, Wasilkowski and Wozniakowski [39] which
is relevant to Grid Computing as described in Foster and Kesselman [17]. The
primary purpose of this paper is to instead represent the conditional measure in terms
of the shorted operator. We provide two distinct proofs of this representation. The
first uses the theory of Gaussian Hilbert spaces and a characterization of the shorted
operator by Anderson and Trapp. The second proof, corresponding to the secondary
purpose of this paper, uses recent developments by Corach, Maestripieri and
Stojanoff on the relationship between the shorted operator and $A$-symmetric oblique
projections. This latter approach has the advantage that it facilitates a general
approximation technique that can be used to approximate not only the covariance
operator but also the conditional expectation operator. This is accomplished through the
development of an approximation theory for the shorted operator in terms of oblique
projections followed by an application of the martingale convergence theorem.
Although the proofs are not fundamentally difficult, the result (which appears to have
been missed in the literature) provides a simple characterization of the conditional
measure, leading to significant approximation results. For instance, the attainment
of the main result through the martingale approach feeds back a strengthening of
the approximation theorem for the shorted operator that was developed for that
purpose: when the operator is trace class the approximation improves from weak
convergence to convergence in trace norm.

Let us review the basic results on Gaussian measures on Hilbert space. A measure
$\mu$ on a Hilbert space $H$ is said to be Gaussian if, for each $h \in H$ considered as a
continuous linear function $h : H \to \mathbb{R}$ by $h(x) := \langle h, x\rangle$, $x \in H$, we have that
the pushforward measure $h_*\mu$ is Gaussian, where we say that a Dirac measure is
Gaussian. For a Gaussian measure $\mu$, its mean $m$ is defined by

$$\langle h, m\rangle := \int_H \langle h, x\rangle \, d\mu(x), \quad h \in H,$$

and its covariance operator $C : H \to H$ is defined by

$$\langle C h_1, h_2\rangle := \int_H \langle h_1, x\rangle \langle h_2, x\rangle \, d\mu(x) - \langle h_1, m\rangle \langle h_2, m\rangle, \quad h_1, h_2 \in H.$$

A Gaussian measure has a well defined mean and a continuous covariance operator,
see e.g. Da Prato and Zabczyk [10, Lem. 2.14]. Finally, Mourier's Theorem [29], see
Vakhania, Tarieladze and Chobanyan [40, Thm. IV.2.4], asserts, for any $m \in H$ and
any positive symmetric trace class operator $C$, that there exists a Gaussian measure
with mean $m$ and covariance operator $C$, and that all Gaussian measures have a
well defined mean and positive trace class covariance operator. This characterization
also follows from Sazonov's Theorem [34, Thm. 1].

Since separable Hilbert spaces are Polish, it follows from the product space
version, see e.g. Dudley [14, Thm. 10.2.2], of the theorem on the existence and
uniqueness of regular conditional probabilities on Polish spaces, that any Gaussian measure
$\mu$ on a direct sum $H = H_1 \oplus H_2$ of separable Hilbert spaces has a regular conditional
probability, that is there is a family $\mu_t$, $t \in H_2$ of conditional measures
corresponding to conditioning on $H_2$. Moreover, Tarieladze and Vakhania [38, Thm. 3.11]
demonstrate that the corresponding family of conditional measures are Gaussian.
Bogachev's [ , Thm. 3.10.1] theorem of normal correlation of Hilbert space valued
Gaussian random variables shows that if two Gaussian random vectors $\xi$ and $\eta$ on
a separable Hilbert space $H$ are jointly Gaussian in the product space, then $E[\xi|\eta]$
is a Gaussian random vector and $\xi = E[\xi|\eta] + \zeta$ where $\zeta$ is a Gaussian random vector
which is independent of $\eta$. Consequently, for any two vectors $h_1, h_2 \in H$ we have

$$E\big[\langle \xi - E[\xi|\eta], h_1\rangle \langle \xi - E[\xi|\eta], h_2\rangle \,\big|\, \eta\big]
= E\big[\langle \zeta, h_1\rangle \langle \zeta, h_2\rangle \,\big|\, \eta\big]
= E\big[\langle \zeta, h_1\rangle \langle \zeta, h_2\rangle\big]$$

and so we conclude that, just as in the finite dimensional case, the conditional
covariance operators are independent of the values of the conditioning variables.

Since both proof techniques will utilize the characterization of conditional
expectation as orthogonal projection, we introduce these notions now. Consider the
Lebesgue-Bochner space $L^2(H, \mu, \mathcal{B}(H))$ of (equivalence classes of) $H$-valued
Borel measurable functions on $H$ whose squared norm

$$\|f\|^2_{L^2(H,\mu,\mathcal{B}(H))} := \int_H \|f(x)\|^2_H \, d\mu(x)$$

is integrable. For a sub $\sigma$-algebra $\Sigma \subset \mathcal{B}(H)$ of the Borel $\sigma$-algebra, consider the
corresponding Lebesgue-Bochner space $L^2(H, \mu, \Sigma)$. As in the scalar case, one can
show that $L^2(H, \mu, \mathcal{B}(H))$ and $L^2(H, \mu, \Sigma)$ are Hilbert spaces and that
$L^2(H, \mu, \Sigma) \subset L^2(H, \mu, \mathcal{B}(H))$ is a closed subspace. Then, if we note that contractive projections on
Hilbert space are orthogonal, see Rao [31, Rmk. 9, pg. 51], it follows from Sundaresan
[36, Prop. 4], see Diestel and Uhl [12, Thm. V.1.4], that conditional expectation
amounts to orthogonal projection.
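The identification of conditional expectation with orthogonal projection can be checked by hand on a finite probability space. The following is only a sketch under stated assumptions: the four-point sample space, the uniform weights, the partition, and all function names are hypothetical choices made for illustration, and the scalar-valued case stands in for the Lebesgue-Bochner setting.

```python
from fractions import Fraction

# Hypothetical toy setting: four equally likely outcomes and the sub
# sigma-algebra generated by the partition {0, 1}, {2, 3}.
weights = [Fraction(1, 4)] * 4
blocks = [[0, 1], [2, 3]]


def inner(f, g):
    """L^2 inner product with respect to the probability weights."""
    return sum(w * a * b for w, a, b in zip(weights, f, g))


def conditional_expectation(f):
    """Replace f on each block of the partition by its weighted average."""
    out = [Fraction(0)] * len(f)
    for block in blocks:
        mass = sum(weights[i] for i in block)
        avg = sum(weights[i] * f[i] for i in block) / mass
        for i in block:
            out[i] = avg
    return out


f = [Fraction(v) for v in (1, 3, 2, 6)]
ef = conditional_expectation(f)
residual = [a - b for a, b in zip(f, ef)]

# The block indicators span the functions measurable for the sub sigma-algebra.
indicators = [[Fraction(int(i in block)) for i in range(4)] for block in blocks]
orthogonal = all(inner(residual, g) == 0 for g in indicators)
idempotent = conditional_expectation(ef) == ef
```

Here `orthogonal` records that the residual is perpendicular to every block-measurable function, and `idempotent` that applying the map twice changes nothing, which are the two defining properties of an orthogonal projection onto the closed subspace of measurable functions.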


2 Shorted Operators 

A symmetric operator $A : H \to H$ is called positive if $\langle Ax, x\rangle \ge 0$ for all $x \in H$.
We denote by $L_+(H)$ the set of positive operators and we denote such positivity by
$A \ge 0$. Positivity induces the (Löwner) partial order $\ge$ on $L_+(H)$. For a closed
subspace $S \subset H$ and a positive operator $A \in L_+(H)$ consider the set

$$\mathcal{H}(A, S) := \{X \in L_+(H) : X \le A \text{ and } R(X) \subset S\}.$$

Then, according to Pekarev [30], Krein [22] and later Anderson and Trapp [2] showed
that $\mathcal{H}(A, S)$ contains a maximal element, which we denote by $\mathcal{S}(A)$ and call the
short of $A$ to $S$. For another closed subspace $T \subset H$, we denote the short of $A$
to $T$ by $\mathcal{T}(A)$. In the proof, Anderson and Trapp [2] demonstrate that when $A$ is
invertible, in terms of its $(S, S^\perp)$ partition representation

$$A = \begin{pmatrix} A_{SS} & A_{SS^\perp} \\ A_{S^\perp S} & A_{S^\perp S^\perp} \end{pmatrix},$$

the block $A_{S^\perp S^\perp}$ is invertible and

$$\mathcal{S}(A) = \begin{pmatrix} A_{SS} - A_{SS^\perp} A_{S^\perp S^\perp}^{-1} A_{S^\perp S} & 0 \\ 0 & 0 \end{pmatrix}.$$

It is easy to show that the assertion holds under the weaker assumption that $A_{S^\perp S^\perp}$
be invertible. Moreover, Anderson and Trapp [2, Cor. 1] asserts for $A, B \in L_+(H)$,
that

$$A \le B \implies \mathcal{S}(A) \le \mathcal{S}(B),$$

that is, $\mathcal{S}$ is monotone in the Löwner ordering. In addition, [2, Cor. 5] asserts that
for two closed subspaces $S$ and $T$, we have

$$(\mathcal{S} \cap \mathcal{T})(A) = \mathcal{S}(\mathcal{T}(A)).$$

Finally, [2, Thm. 6] asserts that if $A : H \to H$ is a positive operator and $S \subset H$ is
a closed linear subspace, then

$$\langle \mathcal{S}(A)s, s\rangle = \inf\{\langle A(s + t), s + t\rangle : t \in S^\perp\}, \quad s \in S. \qquad (2.1)$$
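In two dimensions the Schur complement formula and the variational characterization (2.1) can be compared directly. The sketch below assumes a hypothetical $2 \times 2$ positive definite matrix with $S$ spanned by the first coordinate and $S^\perp$ by the second; the numerical entries and the names `quad`, `short`, and `t_star` are illustrative only.

```python
from fractions import Fraction

# Hypothetical matrix A = [[a, b], [b, d]] on H = S + S_perp, with S spanned
# by the first coordinate vector and S_perp by the second.
a, b, d = Fraction(4), Fraction(2), Fraction(2)


def quad(s, t):
    """Quadratic form <A(s + t), s + t> for s in S and t in S_perp."""
    return a * s * s + 2 * b * s * t + d * t * t


# Nonzero entry of the short of A to S: the Schur complement a - b d^{-1} b.
short = a - b * b / d

s = Fraction(3)
t_star = -b * s / d          # stationary point of t -> quad(s, t)
minimum = quad(s, t_star)

# Sample values of t around t_star; none should fall below the minimum.
samples = [quad(s, t_star + Fraction(k, 10)) for k in range(-20, 21)]
```

With these numbers `minimum` agrees with `short * s * s`, which is the statement of (2.1) for this $s$: the infimum over $t \in S^\perp$ of the quadratic form equals the quadratic form of the shorted operator.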

In Section 4.1 we demonstrate how the characterization (2.1) of the shorted operator
combined with the theory of Gaussian Hilbert spaces provides a natural proof of our
main result, the following theorem. Here we consider a direct sum split $H = H_1 \oplus H_2$,
and let $S = H_1$ and $S^\perp = H_2$, so that the short $\mathcal{S}(A)$ of an operator to the subspace
$S = H_1$ will be written as $\mathcal{H}_1(A)$.

Theorem 2.1. Consider a Gaussian measure $\mu$ on an orthogonal direct sum $H =
H_1 \oplus H_2$ of separable Hilbert spaces with mean $m$ and covariance operator $C$. Then
for all $t \in H_2$, the conditional measure $\mu_t$ is a Gaussian measure with covariance
operator $\mathcal{H}_1(C)$.


3 Oblique Projections 

In this section, we will prepare for an alternative proof of Theorem 2.1 using oblique
projections along with the development of approximations of the covariance operator
and the conditional expectation operator generated by natural sequences of oblique
projections. To that end, let us introduce some notation. For a separable Hilbert
space $H$, we denote the usual, or strong, convergence of sequences by $h_n \to h$ and
the weak convergence by $h_n \xrightarrow{w} h$. Let $L(H)$ denote the Banach algebra of bounded
linear operators on $H$. For an operator $A \in L(H)$, we let $R(A)$ denote its range
and $\ker(A)$ denote its nullspace. Recall the uniform operator topology on $L(H)$
defined by the norm $\|A\| := \sup_{\|h\| \le 1} \|Ah\|$. We say that a sequence of operators
$A_n \in L(H)$ converges strongly to $A \in L(H)$, that is

$$A = s\text{-}\lim_{n \to \infty} A_n,$$

if $A_n h \to Ah$ for all $h \in H$, and we say that $A_n \to A$ weakly or

$$A = w\text{-}\lim_{n \to \infty} A_n$$

if $A_n h \xrightarrow{w} Ah$ for all $h \in H$. Recall that an operator $A \in L(H)$ is called trace class
if the trace norm

$$\|A\|_1 := \sum_{i=1}^{\infty} \langle |A| e_i, e_i\rangle$$

is finite for some orthonormal basis, where $|A| := \sqrt{A^*A}$ is the absolute value. When
it is finite, then $\mathrm{tr}(A) := \sum_{i=1}^{\infty} \langle A e_i, e_i\rangle$ is well defined, and for all positive trace
class operators $A$ we have $\mathrm{tr}(A) = \|A\|_1$. The trace norm $\|\cdot\|_1$ makes the subspace
$L_1(H) \subset L(H)$ of trace class operators into a Banach space. It is well known that
the sequence of operator topologies

weak $\to$ strong $\to$ uniform operator $\to$ trace norm

increases from left to right in strength.

For a positive operator $A : H \to H$, let us define the set of ($A$-symmetric) oblique
projections

$$\mathcal{P}(A, S^\perp) := \{Q \in L(H) : Q^2 = Q,\ R(Q) = S^\perp,\ AQ = Q^*A\}$$

onto $S^\perp$, where $Q^*$ is the adjoint of $Q$ with respect to the scalar product $\langle \cdot, \cdot\rangle$ on
$H$. The pair $(A, S^\perp)$ is said to be compatible, or $S^\perp$ is said to be compatible with
$A$, if $\mathcal{P}(A, S^\perp)$ is nonempty. For any oblique projection $Q \in \mathcal{P}(A, S^\perp)$, Corach,
Maestripieri and Stojanoff [7, Prop. 4.2] asserts that for $E := 1 - Q$, we have

$$\mathcal{S}(A) = AE = E^*AE. \qquad (3.1)$$

Moreover, when $(A, S^\perp)$ is compatible, according to Corach, Maestripieri and
Stojanoff [7, Def. 3.4], there is a special element $Q_{A,S^\perp} \in \mathcal{P}(A, S^\perp)$ defined in the
following way: by [7, Prop. 3.3] and the factorization theorem [7, Thm. 2.2] of Douglas
[13] and Fillmore and Williams [16], there is a unique operator $\hat{Q} : S \to S^\perp$ which
satisfies $A_{S^\perp S^\perp} \hat{Q} = A_{S^\perp S}$ such that $\ker(\hat{Q}) = \ker(A_{S^\perp S})$ and $R(\hat{Q}) \subset \overline{R(A_{S^\perp S^\perp})}$.
Defining

$$Q_{A,S^\perp} := \begin{pmatrix} 0 & 0 \\ \hat{Q} & 1 \end{pmatrix}, \qquad (3.2)$$

[7, Thm. 3.5] asserts that $Q_{A,S^\perp} \in \mathcal{P}(A, S^\perp)$.
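A small matrix computation makes the relation (3.1) and the construction (3.2) concrete. This is a sketch only: it assumes a hypothetical $2 \times 2$ matrix with invertible lower-right entry, so that $\hat{Q}$ is simply $A_{22}^{-1}A_{21}$, and the helper names `matmul` and `transpose` are introduced for illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]


# Hypothetical positive definite matrix, partitioned with S = span(e1)
# and S_perp = span(e2).
A = [[Fraction(4), Fraction(2)], [Fraction(2), Fraction(2)]]

q_hat = A[1][0] / A[1][1]        # solves A22 * q_hat = A21
Q = [[0, 0], [q_hat, 1]]         # candidate oblique projection onto S_perp
E = [[1, 0], [-q_hat, 0]]        # E = 1 - Q

is_projection = matmul(Q, Q) == Q
is_symmetric = matmul(A, Q) == matmul(transpose(Q), A)   # A Q = Q* A
AE = matmul(A, E)
EtAE = matmul(transpose(E), AE)
schur = A[0][0] - A[0][1] * A[1][0] / A[1][1]
```

Both `AE` and `EtAE` come out as the matrix with `schur` in the upper-left corner and zeros elsewhere, in agreement with (3.1) and with the Schur complement form of the short for invertible $A_{S^\perp S^\perp}$.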

When the pair $(A, S^\perp)$ is not compatible, we seek an approximating sequence $A^n$
to $A$ which is compatible with $S^\perp$, such that the limit of $\mathcal{S}(A^n)$ is $\mathcal{S}(A)$. Although
Anderson and Trapp [2, Cor. 2] show that if $A^n$ is a monotone decreasing sequence
of positive operators which converge strongly to $A$, then the decreasing sequence of
positive operators $\mathcal{S}(A^n)$ strongly converges to $\mathcal{S}(A)$, the approximation from above
by $A^n := A + \frac{1}{n} I$ determines operators which are not trace class, so is not useful
for the approximation problem for the covariance operators for Gaussian measures.
Since the trace class operators are well approximated from below by finite rank
operators one might hope to approximate $A$ by an increasing sequence of finite rank
operators. However, it is easy to see that, in general, the same convergence result
does not hold for increasing sequences. The following theorem demonstrates, for
any positive operator $A$, how to produce a sequence of positive operators $A^n$ which
are compatible with $S^\perp$ such that $\mathcal{S}(A^n)$ weakly converges to $\mathcal{S}(A)$.

Henceforth we consider a direct sum split $H = H_1 \oplus H_2$, and let $S = H_1$ and
$S^\perp = H_2$, so that the short $\mathcal{S}(A)$ of an operator to the subspace $S = H_1$ will be
written as $\mathcal{H}_1(A)$. Let us also denote by $P_i : H \to H$ the orthogonal projections
onto $H_i$, for $i = 1, 2$, and let $\Pi_i : H \to H_i$ denote the corresponding projections and
$\Pi_i^* : H_i \to H$ the corresponding injections. For any operator $A : H \to H$, consider
the decomposition

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

where the components are defined by $A_{ij} := \Pi_i A \Pi_j^*$, $i, j = 1, 2$.

Theorem 3.1. Consider a positive operator $A : H \to H$ on a separable Hilbert
space. Then for any orthogonal split $H = H_1 \oplus H_2$, and any ordered orthonormal
basis of $H_2$, we let $H_2^n$ denote the span of the first $n$ basis elements and let $P^n :=
P_{H_1} + P_{H_2^n}$ denote the orthogonal projection onto $H_1 \oplus H_2^n$. Then the sequence of
positive operators

$$A^n := P^n A P^n, \quad n = 1, \ldots$$

is compatible with $H_2$ and

$$\mathcal{H}_1(A) = w\text{-}\lim_{n \to \infty} \mathcal{H}_1(A^n).$$
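A finite-dimensional stand-in gives a feel for the truncations $A^n = P^n A P^n$. The sketch below assumes a hypothetical $4 \times 4$ positive definite matrix with $H_1 = \mathbb{R}$ and $H_2 = \mathbb{R}^3$, and it evaluates the short of $A^n$ through the variational formula (2.1), in which the infimum for $A^n$ only sees the first $n$ coordinates of $H_2$. The solver and all names are illustrative; in finite dimensions the limit is of course reached at $n = 3$.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination (M assumed positive definite)."""
    n = len(rhs)
    aug = [list(map(Fraction, M[i])) + [Fraction(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


# Hypothetical blocks of A with respect to H = H1 + H2, H1 = R, H2 = R^3.
a11 = Fraction(4)
A12 = [1, 1, 1]
A22 = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]


def short_of_truncation(n):
    """Scalar short of A^n to H1: a11 - b^T (A22 restricted to H2^n)^{-1} b."""
    b = A12[:n]
    block = [row[:n] for row in A22[:n]]
    x = solve(block, b)
    return a11 - sum(bi * xi for bi, xi in zip(b, x))


shorts = [short_of_truncation(n) for n in (1, 2, 3)]
```

In this example the values are $7/2$, $10/3$, $3$, ending at the Schur complement of the full matrix. They happen to be non-increasing here, which is what (2.1) suggests, since the infimum is taken over a larger subspace as $n$ grows.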


Remark 3.2. For an increasing sequence $A^n$ of positive operators converging strongly
to $A$, the monotonicity of the shorting operation implies that the sequence $\mathcal{H}_1(A^n)$
is increasing, and therefore Vigier's Theorem, see e.g. Halmos [19, Prb. 120], implies
that the sequence $\mathcal{H}_1(A^n)$ converges strongly. Although the sequence $A^n := P^n A P^n$
defined in Theorem 3.1 is positive and converges strongly to $A$, in general, it is not
increasing in the Löwner order, so that Vigier's Theorem does not apply, possibly
suggesting why we only obtain convergence in the weak operator topology. With
stronger assumptions on the operator $A$ and a well chosen selection of an ordered
orthonormal basis of $H_2$, we conjecture that convergence in a stronger topology may
be available. In particular, as a corollary to our main result, when $A$ is trace class,
we establish in Corollary 3.4 that

$$\mathcal{H}_1(A^n) \to \mathcal{H}_1(A) \quad \text{in trace norm}.$$

For any $m \in H$, we let $m = (m_1, m_2)$ denote its decomposition in $H = H_1 \oplus H_2$.
Moreover, for any projection $Q : H \to H$ with $R(Q) = H_2$ we let $\hat{Q} : H_1 \to H_2$
denote the unique operator such that

$$Q = \begin{pmatrix} 0 & 0 \\ \hat{Q} & 1 \end{pmatrix}$$

and denote by $\hat{Q}^* : H_2 \to H_1$ the adjoint of $\hat{Q}$ defined by the relation
$\langle \hat{Q}^* h_2, h_1\rangle_{H_1} = \langle h_2, \hat{Q} h_1\rangle_{H_2}$ for all $h_1 \in H_1$, $h_2 \in H_2$.

The following theorem constitutes an expansion of our main result, Theorem 2.1,
to include natural approximations for the conditional covariance operator and the
conditional expectation operator.

Theorem 3.3. Consider a Gaussian measure $\mu$ on an orthogonal direct sum $H =
H_1 \oplus H_2$ of separable Hilbert spaces with mean $m$ and covariance operator $C$. Then
for all $t \in H_2$, the conditional measure $\mu_t$ is a Gaussian measure with covariance
operator $\mathcal{H}_1(C)$.

If the covariance operator $C$ is compatible with $H_2$, then for any oblique projection
$Q$ in $\mathcal{P}(C, H_2) \neq \emptyset$, the mean $m_t$ of the conditional measure $\mu_t$ is

$$m_t = m_1 + \hat{Q}^*(t - m_2).$$

In the general case, for any ordered orthonormal basis for $H_2$, let $H_2^n$ denote the span
of the first $n$ basis elements, let $P^n := P_{H_1} + P_{H_2^n}$ denote the orthogonal projection
onto $H_1 \oplus H_2^n$, and define the approximate $C^n := P^n C P^n$. Then $C^n$ is compatible
with $H_2$ for all $n$, and for any sequence $Q_n \in \mathcal{P}(C^n, H_2) \neq \emptyset$ of oblique projections,
we have



for $\mu$-almost every $t$. If the sequence $Q_n$ eventually becomes the special element
$Q_n = Q_{C^n,H_2}$ defined near (3.2), then we have



for $\mu$-almost every $t$.

As a corollary to Theorem 3.3, we obtain a strengthening of the assertion of
Theorem 3.1 when the operator $A$ is trace class.

Corollary 3.4. Consider the situation of Theorem 3.1 with $A$ trace class. Then

$$\mathcal{H}_1(A^n) \to \mathcal{H}_1(A) \quad \text{in trace norm}.$$


4 Proofs 

4.1 First proof of Theorem 2.1 

Consider the Lebesgue-Bochner space $L^2(H, \mu, \mathcal{B}(H))$ of (equivalence classes of)
$H$-valued Borel measurable functions on $H$ whose squared norm

$$\|f\|^2_{L^2(H,\mu,\mathcal{B}(H))} := \int_H \|f(x)\|^2_H \, d\mu(x)$$

is integrable. For any square Bochner integrable function $f \in L^2(H, \mu, \mathcal{B}(H))$ and
any $h \in H$, we have that $\langle f, h\rangle$ is square integrable, that is $\langle f, h\rangle \in L^2(\mathbb{R}, \mu, \mathcal{B}(H))$.
Moreover, it is easy to see, see e.g. [1, Lem. 11.45], that if $f$ is Bochner integrable,
then for all $h \in H$, we have that $\langle f, h\rangle$ is integrable and $\int \langle f, h\rangle \, d\mu = \langle \int f \, d\mu, h\rangle$.

Now consider the orthogonal decomposition $H = H_1 \oplus H_2$ and the Borel
$\sigma$-algebra $\mathcal{B}(H_2)$. Let us use the shorthand notation

$$\mathcal{B} := \mathcal{B}(H), \qquad \mathcal{B}_2 := \{(H_1, T) : T \in \mathcal{B}(H_2)\}.$$


The definition of conditional expectation in Lebesgue-Bochner space, that is that
$E[f|\mathcal{B}_2]$ is the unique $\mu$-almost everywhere $\mathcal{B}_2$-measurable function such that



combined with Hille's theorem [12, Thm. II.6], that for each $h \in H$ we have



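and

implies that

$$E[\langle h, f\rangle | \mathcal{B}_2] = \langle h, E[f | \mathcal{B}_2]\rangle, \quad h \in H,$$

thus implying the following commutative diagram for all $h \in H$:

$$\begin{array}{ccc}
L^2(H, \mu, \mathcal{B}) & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & L^2(H, \mu, \mathcal{B}_2) \\
\Big\downarrow{\scriptstyle \langle h, \cdot\rangle} & & \Big\downarrow{\scriptstyle \langle h, \cdot\rangle} \\
L^2(\mathbb{R}, \mu, \mathcal{B}) & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & L^2(\mathbb{R}, \mu, \mathcal{B}_2)
\end{array} \qquad (4.1)$$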


When $\mu$ is a Gaussian measure, the theory of Gaussian Hilbert spaces, see
e.g. Janson [20], provides a stronger characterization of conditional expectation of
the canonical random variable $X(h) := h$, $h \in H$ when conditioning on a subspace
and captures the full linear nature of Gaussian conditioning. Let us assume
henceforth that $\mu$ is a centered Gaussian measure. Then Fernique's Theorem [15], see [10,
Thm. 2.6], implies that the random variable $X$ is square Bochner integrable. For
any element $h \in H$, let us denote the corresponding function $\xi_h : H \to \mathbb{R}$ defined by
$\xi_h(h') = \langle h, h'\rangle$, $h' \in H$. Then the discussion above shows that for any $h \in H$
the real-valued random variable $\xi_h$ is square integrable, that is $\xi_h \in L^2(\mathbb{R}, \mu, \mathcal{B})$. Let

$$\xi : H \to L^2(\mathbb{R}, \mu, \mathcal{B})$$

denote the resulting linear mapping defined by

$$h \mapsto \xi_h \in L^2(\mathbb{R}, \mu, \mathcal{B}), \quad h \in H.$$

It is straightforward to show that $\xi$ is injective if and only if the covariance operator
$C$ of the Gaussian measure $\mu$ is injective. By the definition of a centered Gaussian
vector $X$, it follows that the law $(\xi_h)_*\mu$ in $\mathbb{R}$ is a univariate centered Gaussian
measure, that is $\xi_h$ is a centered Gaussian real-valued random variable. Consequently,
let us consider the closed linear subspace

$$H^\mu := \overline{\xi(H)} \subset L^2(\mathbb{R}, \mu, \mathcal{B})$$

generated by the elements $\xi_h \in L^2(\mathbb{R}, \mu, \mathcal{B})$, $h \in H$. By Janson [20, Thm. 1.1.3],
this closure $H^\mu \subset L^2(\mathbb{R}, \mu, \mathcal{B})$ also consists of centered Gaussian random variables,
and since it is a closed subspace of a Hilbert space, it is a Hilbert space and
therefore a Gaussian Hilbert space as defined in Janson [20]. Moreover, by Janson [20,
Thm. 8.15], $H^\mu$ is a feature space for the Cameron-Martin reproducing kernel Hilbert
space with feature map $\xi : H \to H^\mu$ and reproducing kernel the covariance operator.
For a closed Hilbert subspace $H_2 \subset H$, we can consider the closed linear subspace

$$H_2^\mu := \overline{\xi(H_2)} \subset L^2(\mathbb{R}, \mu, \mathcal{B}_2)$$

generated by the elements $\xi_{h_2}$, $h_2 \in H_2$ in the same way. $H_2^\mu$ is also a Gaussian
Hilbert space and we have the natural subspace identification $H_2^\mu \subset H^\mu$. Since
separable Hilbert spaces are Polish, and an orthonormal basis is a separating set, it
follows, see e.g. Vakhania, Tarieladze and Chobanyan [40, Thm. 1.1.2], that for an
orthonormal basis $e_i$, $i \in I$ of a separable Hilbert space, the $\sigma$-algebra generated
by the corresponding real-valued functions $\sigma(\{\xi_{e_i}, i \in I\})$ is the Borel $\sigma$-algebra of
the Hilbert space. Consequently, we obtain from Janson [20, Thm. 9.1] that for any
$h \in H$,

$$E[\xi_h | \mathcal{B}_2] = E[\xi_h | \sigma(\{\xi_{h_2}, h_2 \in H_2\})] = P_{H_2^\mu} \xi_h$$

where $P_{H_2^\mu} : H^\mu \to H_2^\mu$ is orthogonal projection. That is, if we let
$E[\cdot|\mathcal{B}_2] : L^2(\mathbb{R}, \mu, \mathcal{B}) \to L^2(\mathbb{R}, \mu, \mathcal{B}_2)$ be the conditional expectation represented as
orthogonal projection and $E[\cdot|\mathcal{B}_2] : H^\mu \to H_2^\mu$ be the conditional expectation
represented as orthogonal projection from the linear subspace $H^\mu \subset L^2(\mathbb{R}, \mu, \mathcal{B})$ onto
the closed subspace $H_2^\mu \subset H^\mu$, we have the following commutative diagram, where
$i_{H^\mu} : H^\mu \to L^2(\mathbb{R}, \mu, \mathcal{B})$ and $i_{H_2^\mu} : H_2^\mu \to L^2(\mathbb{R}, \mu, \mathcal{B}_2)$ denote the closed subspace
injections.

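$$\begin{array}{ccc}
L^2(\mathbb{R}, \mu, \mathcal{B}) & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & L^2(\mathbb{R}, \mu, \mathcal{B}_2) \\
\Big\uparrow{\scriptstyle i_{H^\mu}} & & \Big\uparrow{\scriptstyle i_{H_2^\mu}} \\
H^\mu & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & H_2^\mu
\end{array} \qquad (4.2)$$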


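which when combined with the diagram (4.1), representing the commutativity of vector
projection and conditional expectation, produces the following commutative diagram
for all $h \in H$:

$$\begin{array}{ccc}
L^2(H, \mu, \mathcal{B}) & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & L^2(H, \mu, \mathcal{B}_2) \\
\Big\downarrow{\scriptstyle \langle h, \cdot\rangle} & & \Big\downarrow{\scriptstyle \langle h, \cdot\rangle} \\
L^2(\mathbb{R}, \mu, \mathcal{B}) & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & L^2(\mathbb{R}, \mu, \mathcal{B}_2) \\
\Big\uparrow{\scriptstyle i_{H^\mu}} & & \Big\uparrow{\scriptstyle i_{H_2^\mu}} \\
H^\mu & \xrightarrow{\;E[\cdot|\mathcal{B}_2]\;} & H_2^\mu \\
\Big\uparrow{\scriptstyle \xi} & & \Big\uparrow{\scriptstyle \xi} \\
H & & H_2
\end{array} \qquad (4.3)$$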


Although there is a natural projection map $P_{H_2} : H \to H_2$ for the bottom of this
diagram, in general it cannot be inserted here while maintaining the commutativity of
the diagram. This comes from the fact that there may exist an $h \in H$ such that
$\xi_h = 0$. However, this does not imply that $\xi_{P_{H_2} h} = 0$.

We are now prepared to obtain the main assertion. The covariance operator of
the random variable $X$ is defined by

$$\langle Ch, h'\rangle = E_\mu[\langle X, h\rangle \langle X, h'\rangle] = E_\mu[\xi_h \xi_{h'}], \quad h, h' \in H.$$

Moreover, by the theorem of normal correlation and the commutativity of the
diagram (4.1), the conditional covariance operator is defined by

$$\begin{aligned}
\langle C(X|X_2)h, h'\rangle
&= E_\mu\big[\langle X - E[X|\mathcal{B}_2], h\rangle \langle X - E[X|\mathcal{B}_2], h'\rangle \,\big|\, \mathcal{B}_2\big] \\
&= E_\mu\big[\langle X - E[X|\mathcal{B}_2], h\rangle \langle X - E[X|\mathcal{B}_2], h'\rangle\big] \\
&= E_\mu\big[(\xi_h - E[\xi_h|\mathcal{B}_2])(\xi_{h'} - E[\xi_{h'}|\mathcal{B}_2])\big], \quad h, h' \in H.
\end{aligned}$$

In terms of the Gaussian Hilbert spaces $H_2^\mu \subset H^\mu$, using the commutativity of the
diagram (4.2) and the identification of the conditional expectation with orthogonal
projection, we conclude that

$$\langle Ch, h'\rangle = \langle \xi_h, \xi_{h'}\rangle_{H^\mu}, \quad h, h' \in H \qquad (4.4)$$

and

$$\langle C(X|X_2)h, h'\rangle = \langle (I - P_{H_2^\mu})\xi_h, (I - P_{H_2^\mu})\xi_{h'}\rangle_{H^\mu}, \quad h, h' \in H. \qquad (4.5)$$

Since the orthogonal projection $P_{H_2^\mu}$ is a metric projection of $H^\mu$ onto $H_2^\mu$, we can
express the dual optimization problem to the metric projection as follows: for any
$h \in H$, using the decomposition $h = h_1 + h_2$ with $h_1 \in H_1$, $h_2 \in H_2$, we decompose
$\xi_h = \xi_{h_1 + h_2} = \xi_{h_1} + \xi_{h_2}$. Then, noting that $(I - P_{H_2^\mu})\xi_{h_2} = 0$, we obtain

$$\begin{aligned}
\|\xi_{h_1} + \xi_{h_2}\|^2_{H^\mu}
&= \|(I - P_{H_2^\mu})(\xi_{h_1} + \xi_{h_2})\|^2_{H^\mu} + \|P_{H_2^\mu}(\xi_{h_1} + \xi_{h_2})\|^2_{H^\mu} \\
&= \|(I - P_{H_2^\mu})\xi_{h_1}\|^2_{H^\mu} + \|P_{H_2^\mu}\xi_{h_1} + \xi_{h_2}\|^2_{H^\mu}.
\end{aligned}$$

Since in the second term on the right-hand side $P_{H_2^\mu}\xi_{h_1} \in H_2^\mu$, there is a sequence
$h_2^n$, $n = 1, \ldots$ such that the corresponding sequence $\xi_{h_2^n}$ converges to $-P_{H_2^\mu}\xi_{h_1}$ in
$L^2(\mathbb{R}, \mu, \mathcal{B})$ and therefore in $H^\mu$, so we conclude that

$$\|(I - P_{H_2^\mu})\xi_{h_1}\|^2_{H^\mu} = \inf_{h_2 \in H_2} \|\xi_{h_1} + \xi_{h_2}\|^2_{H^\mu}.$$

From the identifications (4.4) and (4.5), we conclude that

$$\langle C(X|X_2)h_1, h_1\rangle = \inf_{h_2 \in H_2} \langle C(X)(h_1 + h_2), h_1 + h_2\rangle.$$

Therefore, Anderson and Trapp [2, Thm. 6] implies the assertion

$$C(X|X_2) = \mathcal{H}_1(C).$$

The assertion in the non-centered case follows by simple translation.

4.2 Proof of Theorem 3.1 

Since $P_{H_2} A^n P_{H_2} = P_{H_2^n} A^n P_{H_2^n}$, the range of $P_{H_2} A^n P_{H_2}$ is finite dimensional,
and therefore closed, so that it follows from Corach, Maestripieri and Stojanoff
[8, Lem. 3.8] that $A^n$ is compatible with $H_2$ for all $n$.

Now we utilize the approximation results of Butler and Morley [5] for the shorted
operator. By [5, Lem. 1], for $c \in H_1$ and for fixed $n$, it follows that there exists a
sequence $y_m^n \in H_2$, $m = 1, \ldots$ and a real number $M$ such that

$$\begin{aligned}
A^n_{11} c + A^n_{12} y_m^n &\to \mathcal{H}_1(A^n)c, & m &\to \infty, \\
A^n_{21} c + A^n_{22} y_m^n &\to 0, & m &\to \infty, \\
\langle A^n_{22} y_m^n, y_m^n\rangle &\le M, & &\forall m.
\end{aligned}$$

Since $A^n_{11} = A_{11}$, $A^n_{12} = A_{12}P_{H_2^n}$, $A^n_{21} = P_{H_2^n}A_{21}$, and $A^n_{22} = P_{H_2^n}A_{22}P_{H_2^n}$, this can
be written as

$$\begin{aligned}
A_{11} c + A_{12} P_{H_2^n} y_m^n &\to \mathcal{H}_1(A^n)c, & m &\to \infty, \\
P_{H_2^n} A_{21} c + P_{H_2^n} A_{22} P_{H_2^n} y_m^n &\to 0, & m &\to \infty, \\
\langle A_{22} P_{H_2^n} y_m^n, P_{H_2^n} y_m^n\rangle &\le M, & &\forall m.
\end{aligned}$$

Since these equations only depend on $P_{H_2^n} y_m^n$ we can further assume that
$P_{(H_2^n)^\perp} y_m^n = 0$, $m = 1, \ldots$, where $P_{(H_2^n)^\perp}$ is the orthogonal projection onto
$(H_2^n)^\perp \subset H_2$. That is, we can assume that $P_{H_2^n} y_m^n = y_m^n$, $m = 1, \ldots$ and therefore

$$\begin{aligned}
A_{11} c + A_{12} y_m^n &\to \mathcal{H}_1(A^n)c, & m &\to \infty, \\
P_{H_2^n} A_{21} c + P_{H_2^n} A_{22} y_m^n &\to 0, & m &\to \infty, \\
\langle A_{22} y_m^n, y_m^n\rangle &\le M, & &\forall m.
\end{aligned} \qquad (4.6)$$

It follows from $\mathcal{H}_1(A^n) \le A^n$ that $\|\sqrt{\mathcal{H}_1(A^n)}\| \le \|\sqrt{A^n}\|$ for the unique square root,
guaranteed to exist by Riesz and Sz.-Nagy [33, Sec. 104]. Consequently, Conway [6,
Prop. II.2.7] implies that $\|\mathcal{H}_1(A^n)\| \le \|A^n\|$ for all $n$ and since $\|A^n\| \le \|A\|$ for all
$n$ it follows that $\|\mathcal{H}_1(A^n)\| \le \|A\|$ for all $n$. Consequently, the sequence $\mathcal{H}_1(A^n)c$
is bounded. Therefore there exists a weakly convergent subsequence. Let $n'$ denote
the index of any weakly convergent subsequence, so that

$$\mathcal{H}_1(A^{n'})c \xrightarrow{w} d', \quad n' \to \infty \qquad (4.7)$$

for some $d'$ depending on the subsequence. Now the strong convergence of the
lefthand side to the righthand side in (4.6) is maintained for the subsequence $n'$
and, since for the subsequence the first term on the righthand side converges weakly
to $d'$, it follows that we can define a monotonically increasing function $m(n')$ and
use it to define a new sequence $y^{n'} := y^{n'}_{m(n')}$ such that

$$\begin{aligned}
A_{11} c + A_{12} y^{n'} &\xrightarrow{w} d', & n' &\to \infty, \\
P_{H_2^{n'}} A_{21} c + P_{H_2^{n'}} A_{22} y^{n'} &\to 0, & n' &\to \infty, \\
\langle A_{22} y^{n'}, y^{n'}\rangle &\le M, & &\forall n'.
\end{aligned} \qquad (4.8)$$

Since $P_{H_2^n}$ is strongly convergent to $P_{H_2}$ it follows that $P_{H_2^{n'}}$ is strongly convergent
to $P_{H_2}$, so that $P_{H_2^{n'}} A_{21} c$ converges to $A_{21} c$ and $P_{H_2^{n'}} A_{22} y^{n'}$ converges to $-A_{21} c$.
Moreover, by Reid's inequality [32, Cor. 2] we have

$$\|A_{22} y^{n'}\|^2_{H_2} \le \|A_{22}\| \langle A_{22} y^{n'}, y^{n'}\rangle \le \|A_{22}\| M, \qquad (4.9)$$

for all $n'$, so that the sequence $A_{22} y^{n'}$ is bounded. Since weak convergence of a
bounded sequence on a separable Hilbert space is equivalent to the convergence
with respect to each element of any orthonormal basis, it follows that $A_{22} y^{n'}$ is
weakly convergent to $-A_{21} c$. From (4.8), we obtain

$$\begin{aligned}
A_{12} y^{n'} &\xrightarrow{w} d' - A_{11} c, & n' &\to \infty, \\
A_{22} y^{n'} &\xrightarrow{w} -A_{21} c, & n' &\to \infty.
\end{aligned} \qquad (4.10)$$


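From Kakutani's [21] generalization of the Banach-Saks Theorem it follows that
we can select a subsequence $\hat{n}$ of $n'$ such that the Cesàro means of $A_{22} y^{n'}$ and $A_{12} y^{n'}$
converge strongly in (4.10). That is, if we consider the Cesàro means

$$z^{\hat{n}} := \frac{1}{\hat{n}} \sum_{k=1}^{\hat{n}} y^{k},$$

the sum running over the first $\hat{n}$ terms of the selected subsequence, we have

$$\begin{aligned}
A_{12} z^{\hat{n}} &\to d' - A_{11} c, & \hat{n} &\to \infty, \\
A_{22} z^{\hat{n}} &\to -A_{21} c, & \hat{n} &\to \infty.
\end{aligned}$$

Since $A_{22} \ge 0$ it follows that the function $y \mapsto \langle A_{22} y, y\rangle$ is convex, so that
$\langle A_{22} z^{\hat{n}}, z^{\hat{n}}\rangle \le M$ for all $\hat{n}$, so that

$$\begin{aligned}
A_{11} c + A_{12} z^{\hat{n}} &\to d', & \hat{n} &\to \infty, \\
A_{21} c + A_{22} z^{\hat{n}} &\to 0, & \hat{n} &\to \infty, \\
\langle A_{22} z^{\hat{n}}, z^{\hat{n}}\rangle &\le M, & &\forall \hat{n}.
\end{aligned}$$

It therefore follows from the main result of Butler and Morley [5, Thm. 1]
that

$$d' = \mathcal{H}_1(A)c.$$

Consequently, by (4.7), we obtain that

$$\mathcal{H}_1(A^{n'})c \xrightarrow{w} \mathcal{H}_1(A)c, \quad n' \to \infty. \qquad (4.11)$$

Since this limit is independent of the chosen weakly converging subsequence, it
follows, see e.g. Zeidler [41, Prop. 10.13], that the full sequence weakly converges to
the same limit, that is we have

$$\mathcal{H}_1(A^n)c \xrightarrow{w} \mathcal{H}_1(A)c, \quad n \to \infty, \qquad (4.12)$$

and since $c$ was arbitrary we conclude that

$$\mathcal{H}_1(A) = w\text{-}\lim_{n \to \infty} \mathcal{H}_1(A^n).$$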


4.3 Proof of Theorem 3.3 

Let us first establish the assertion when $C$ is compatible with $H_2$. Consider the
operator $\hat{C} : H \to H$ defined by

$$\hat{C} := \mathcal{H}_1(C) + P_2 C P_2.$$

Since $C$ is compatible with $H_2$, there exists an oblique projection $Q \in \mathcal{P}(C, H_2)$,
and Corach, Maestripieri and Stojanoff [7, Prop. 4.2] asserts that for $E := 1 - Q$,
we have

$$\mathcal{H}_1(C) = CE = E^*CE. \qquad (4.13)$$

Since $Q^*C = CQ$ it follows that $E^*C = CE$, and since $Q$ is a projection, it follows
that $QE = EQ = 0$ and that $E$ is a projection. Moreover, since $R(Q) = H_2$ it
follows that $\ker(E) = H_2$, so that we obtain $P_2 Q = Q$ and $E P_1 = E$ and therefore
$Q^* P_2 = Q^*$ and $P_1 E^* = E^*$. Consequently, we obtain

$$\begin{aligned}
(P_1 + Q)^* \hat{C} (P_1 + Q)
&= (P_1 + Q)^* (E^*CE + P_2 C P_2)(P_1 + Q) \\
&= (P_1 + Q)^* (E^*CE + P_2 C Q) \\
&= E^*CE + Q^*CQ \\
&= CE + CQ \\
&= C,
\end{aligned}$$

that is,

$$C = (P_1 + Q)^* \hat{C} (P_1 + Q). \qquad (4.14)$$


Since $Q$ is a projection onto $H_2$, it follows that $P_1 + Q$ is lower triangular in
its partitioned representation and therefore the fundamental pivot produces an
explicit, and most importantly continuous, inverse. Indeed, if we use the partition
representation

$$Q = \begin{pmatrix} 0 & 0 \\ \hat{Q} & 1 \end{pmatrix}$$

we see that

$$P_1 + Q = \begin{pmatrix} 1 & 0 \\ \hat{Q} & 1 \end{pmatrix}$$

from which we conclude that

$$(P_1 + Q)^{-1} = \begin{pmatrix} 1 & 0 \\ -\hat{Q} & 1 \end{pmatrix}.$$

Without partitioning, using $P_1 Q = 0$ and $Q P_2 = P_2$, we obtain

$$\begin{aligned}
(2 - P_1 - Q)(P_1 + Q)
&= 2P_1 + 2Q - (P_1^2 + P_1 Q + Q P_1 + Q^2) \\
&= 2P_1 + 2Q - P_1 - P_1 Q - Q P_1 - Q \\
&= P_1 + Q - Q P_1 \\
&= P_1 + Q P_2 \\
&= P_1 + P_2 \\
&= 1
\end{aligned}$$

and so confirm that

$$(P_1 + Q)^{-1} = 2 - P_1 - Q. \qquad (4.15)$$
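The identity (4.15) is easy to confirm numerically in the smallest case. The sketch below assumes $H_1 = H_2 = \mathbb{R}$ and a hypothetical value for the scalar $\hat{Q}$; the helper names are illustrative.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(alpha, X, beta, Y):
    """Entrywise alpha * X + beta * Y."""
    return [[alpha * X[i][j] + beta * Y[i][j] for j in range(2)] for i in range(2)]


identity = [[1, 0], [0, 1]]
P1 = [[1, 0], [0, 0]]            # orthogonal projection onto H1
q_hat = 3                        # hypothetical scalar block of Q
Q = [[0, 0], [q_hat, 1]]         # projection with range H2

T = combine(1, P1, 1, Q)                                  # P1 + Q
candidate = combine(1, combine(2, identity, -1, P1), -1, Q)   # 2 - P1 - Q

left = matmul(candidate, T)
right = matmul(T, candidate)
```

Both `left` and `right` equal the identity matrix, so `candidate` is a two-sided inverse of $P_1 + Q$, and it coincides with the lower triangular matrix with $-\hat{Q}$ in the corner.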

Following the proof of Hairer, Stuart, Voss, and Wiberg [18, Lem. 4.3], let $\mathcal{N}(m, C)$
denote the Gaussian measure with mean $m$ and covariance operator $C$ and consider
the transformation

$$(P_1 + Q)^{-*} : H \to H,$$

where we use the notation $A^{-*}$ for $(A^{-1})^* = (A^*)^{-1}$. From (4.14) we obtain

$$(P_1 + Q)^{-*} C (P_1 + Q)^{-1} = \hat{C} \qquad (4.16)$$

so that the transformation law for Gaussian measures, see Maniglia and Rhandi [28,
Ch. 1, Lem. 1.2.7], implies that

$$\big((P_1 + Q)^{-*}\big)_* \mathcal{N}(m, C) = \mathcal{N}\big((P_1 + Q)^{-*} m, \hat{C}\big).$$

Since

$$(P_1 + Q)^{-1} = \begin{pmatrix} 1 & 0 \\ -\hat{Q} & 1 \end{pmatrix}$$

we obtain

$$(P_1 + Q)^{-*} = \begin{pmatrix} 1 & -\hat{Q}^* \\ 0 & 1 \end{pmatrix}$$

and therefore

$$(P_1 + Q)^{-*} m = \begin{pmatrix} m_1 - \hat{Q}^* m_2 \\ m_2 \end{pmatrix}.$$

Since the partition representation of $\hat{C}$ is

$$\hat{C} = \begin{pmatrix} (\mathcal{H}_1(C))_{11} & 0 \\ 0 & C_{22} \end{pmatrix}$$

the components of the corresponding Gaussian random variable are uncorrelated
and therefore independent. That is, we have

$$\mathcal{N}\big((P_1 + Q)^{-*} m, \hat{C}\big) = \mathcal{N}\big(m_1 - \hat{Q}^* m_2, (\mathcal{H}_1(C))_{11}\big) \otimes \mathcal{N}(m_2, C_{22}).$$

This independence facilitates the computation of the conditional measure as follows.
Let $X = (X_1, X_2)$ denote the random variable associated with the Gaussian measure
$\mathcal{N}(m, C)$ and consider the transformed random variable $Y = (P_1 + Q)^{-*} X$ with the
product law $\mathcal{N}(m_1 - \hat{Q}^* m_2, (\mathcal{H}_1(C))_{11}) \otimes \mathcal{N}(m_2, C_{22})$. Then,

$$\begin{aligned}
Y_1 &= X_1 - \hat{Q}^* X_2 \\
Y_2 &= X_2
\end{aligned}$$

can be used to compute the conditional expectation as

$$\begin{aligned}
E[X_1 | X_2] &= E[X_1 - \hat{Q}^* X_2 | X_2] + E[\hat{Q}^* X_2 | X_2] \\
&= E[Y_1 | Y_2] + E[\hat{Q}^* X_2 | X_2] \\
&= E[Y_1] + \hat{Q}^* X_2,
\end{aligned}$$

obtaining

$$E[X_1 | X_2] = E[Y_1] + \hat{Q}^* X_2, \qquad (4.17)$$

so that we conclude that

$$E[X_1 | X_2] = m_1 + \hat{Q}^*(X_2 - m_2).$$

A similar calculation obtains the covariance

$$C(X | X_2) = \mathcal{H}_1(C), \qquad (4.18)$$

thus establishing the assertion in the compatible case.
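For a bivariate Gaussian the whole argument reduces to scalar arithmetic, which makes the decorrelating change of variables easy to follow. The sketch below assumes hypothetical values for the mean, the covariance entries, and the observed value $t$; all names are illustrative, and $\hat{Q}^*$ is the scalar $C_{12} C_{22}^{-1}$ because $C_{22}$ is invertible here.

```python
from fractions import Fraction

# Hypothetical bivariate Gaussian: mean (m1, m2) and covariance
# [[c11, c12], [c12, c22]] on H = H1 + H2 with H1 = H2 = R.
m1, m2 = Fraction(1), Fraction(2)
c11, c12, c22 = Fraction(4), Fraction(2), Fraction(2)

q_hat_star = c12 / c22            # adjoint of the solution of c22 * q_hat = c21

# Covariance structure of Y1 = X1 - q_hat_star * X2 and Y2 = X2.
cov_y1_y2 = c12 - q_hat_star * c22
var_y1 = c11 - 2 * q_hat_star * c12 + q_hat_star ** 2 * c22

# Conditional mean and covariance given X2 = t.
t = Fraction(5)
conditional_mean = m1 + q_hat_star * (t - m2)
conditional_cov = c11 - c12 * c12 / c22     # Schur complement, the short to H1
```

The vanishing of `cov_y1_y2` is the independence of $Y_1$ and $Y_2$ used above, and `var_y1` agrees with `conditional_cov`, matching (4.18) in this scalar case.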

For the general case, we do not assume that $C$ is compatible with $H_2$. Consider
an ordered orthonormal basis for $H_2$, let $H_2^n$ denote the span of the first $n$ basis
elements, let $P^n := P_{H_1} + P_{H_2^n}$ denote the orthogonal projection onto $H_1 \oplus H_2^n$, and
consider the sequence of Gaussian measures $\mu_n := P^n_* \mu$ with means $P^n m$ and
covariance operators

\[ C_n := P^n C P^n, \qquad n = 1, \ldots \]

As asserted in Theorem 3.1, $C_n$ is compatible with $H_2$ for all $n$, and the sequence
$\mathcal{S}(C_n)$ of shorts converges weakly to $\mathcal{S}(C)$. Let $C(X_1 \mid X_2^n)$ and $C(X_1 \mid X_2)$ denote the
conditional covariance operators associated with the measure $\mu$. Then we will show that
$C(X_1 \mid X_2^n) = \mathcal{S}(C_n)$, so that the assertion regarding the conditional covariance
operators is established if we demonstrate that the sequence of conditional covariance
operators $C(X_1 \mid X_2^n)$ converges weakly to $C(X_1 \mid X_2)$.
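
To make the approximating operators concrete, here is a sketch in finite dimensions. It assumes that $H_2$ is finite dimensional and that the block of $C$ acting on $H_2^n$ is invertible; the block names $C_{12}^{(n)}$, $C_{21}^{(n)}$, $C_{22}^{(n)}$ are introduced only for this illustration.

```latex
% Write H = H_1 \oplus H_2^n \oplus (H_2^n)^\perp (finite dimensions assumed).
% Hypothetical block names: C_12^{(n)} maps H_2^n to H_1, and so on.
\[
C_n = P^n C P^n =
\begin{pmatrix}
C_{11} & C_{12}^{(n)} & 0 \\
C_{21}^{(n)} & C_{22}^{(n)} & 0 \\
0 & 0 & 0
\end{pmatrix}.
\]
% If C_22^{(n)} is invertible, the covariance of X_1 conditioned on the
% first n coordinates of X_2 is the Schur complement
\[
C_{11} - C_{12}^{(n)} \big( C_{22}^{(n)} \big)^{-1} C_{21}^{(n)} .
\]
```

Under these assumptions, passing from $C$ to $C_n$ amounts to conditioning on only the first $n$ coordinates of $X_2$, which is the finite-dimensional picture behind the martingale argument that follows.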

To both ends, consider the Lebesgue-Bochner space $L^2(H, \mu, \mathcal{B})$ of (equivalence
classes of) $H$-valued Borel measurable functions on $H$ whose squared norm

\[ \|f\|^2_{L^2(H, \mu, \mathcal{B})} := \int_H \|f(x)\|^2_H \, d\mu(x) \]

is finite. Since Fernique’s Theorem [15], see [10, Thm. 2.6], implies that the
random variable $X$ is square Bochner integrable, it follows that the Gaussian random
variables $P^n X$ are also square Bochner integrable with respect to $\mu$. Let us denote
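$\mathcal{B}_2 := \{(H_1, T) : T \in \mathcal{B}(H_2)\}$ and $\mathcal{B}_2^n := \{(H_1, T^n, (H_2^n)^\perp) : T^n \in \mathcal{B}(H_2^n)\}$, and let
$\mu_n := P^n_* \mu$ denote the image under the projection. Then $\mu_n$ is a Gaussian measure on $H$
with mean $P^n m$ and covariance $C_n$.

Now consider a function $f : H \to H$ which is Bochner square integrable with
respect to $\mu$ and satisfies $f \circ P^n = f$. Then, using the change of variables formula for
Bochner integrals, see Bashirov [3, Thm. 2.26], along with the fact that $(P^n)^{-1} \mathcal{B}_2^n = \mathcal{B}_2^n$,
and using the fact that for an arbitrary $\mathcal{B}_2^n$-measurable function $g$ we have
$g = g \circ P^n$, it follows that for $A \in \mathcal{B}_2^n$, we have

\begin{align*}
\int_A \mathbb{E}_{\mu_n}[f \mid \mathcal{B}_2^n] \, d\mu_n
&= \int_A f \, d\mu_n \\
&= \int_{(P^n)^{-1} A} f \circ P^n \, d\mu \\
&= \int_{(P^n)^{-1} A} f \, d\mu \\
&= \int_{(P^n)^{-1} A} \mathbb{E}_\mu[f \mid (P^n)^{-1} \mathcal{B}_2^n] \, d\mu \\
&= \int_{(P^n)^{-1} A} \mathbb{E}_\mu[f \mid \mathcal{B}_2^n] \, d\mu \\
&= \int_{(P^n)^{-1} A} \mathbb{E}_\mu[f \mid \mathcal{B}_2^n] \circ P^n \, d\mu \\
&= \int_A \mathbb{E}_\mu[f \mid \mathcal{B}_2^n] \, d\mu_n,
\end{align*}

so that we obtain

\[ \mathbb{E}_{\mu_n}[f \mid \mathcal{B}_2^n] = \mathbb{E}_\mu[f \mid \mathcal{B}_2^n], \tag{4.19} \]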

and conclude that the sequence $\mathbb{E}_{\mu_n}[f \mid \mathcal{B}_2^n]$, $n = 1, \ldots$ is a martingale corresponding
to the increasing family of $\sigma$-algebras $\mathcal{B}_2^n$. Moreover, it is easy to see that (4.19)
holds for real valued functions $f : H \to \mathbb{R}$ which are square integrable with respect
to $\mu$ and satisfy $f \circ P^n = f$. With the choice $f := X_1$, we clearly have $X_1 \circ P^n = X_1$,
so that if we denote $X_2^n := P^n X_2$, we conclude that the sequence

\[ \mathbb{E}_{\mu_n}[X_1 \mid X_2^n] = \mathbb{E}_\mu[X_1 \mid X_2^n], \qquad n = 1, \ldots \tag{4.20} \]

is a martingale. Since conditional expectation is a contraction, it follows that the $L^2$
norms of all the conditional expectations are uniformly bounded by the $L^2$ norm of $X$.
Then by the Martingale Convergence Theorem of Diestel and Uhl [12, Cor. V.2.2],
$\mathbb{E}_{\mu_n}[X_1 \mid X_2^n]$ converges to $\mathbb{E}_\mu[X_1 \mid X_2]$ in $L^2(H, \mu, \mathcal{B})$.

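For the conditional covariance operators, observe that (4.20) implies that

\[ X - \mathbb{E}_{\mu_n}[X \mid X_2] = X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n] \tag{4.21} \]

for all $n$, so that for $h_1, h_2 \in H$, we have

\begin{align*}
\langle C_{\mu_n}(X \mid X_2) h_1, h_2 \rangle
&= \mathbb{E}_{\mu_n}\big[ \langle X - \mathbb{E}_{\mu_n}[X \mid X_2], h_1 \rangle \langle X - \mathbb{E}_{\mu_n}[X \mid X_2], h_2 \rangle \,\big|\, X_2 \big] \\
&= \mathbb{E}_{\mu_n}\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \,\big|\, X_2 \big],
\end{align*}

and since the integrand $f := \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle$ satisfies
$f \circ P^n = f$, it follows from (4.19) that

\begin{align*}
&\mathbb{E}_{\mu_n}\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \,\big|\, X_2 \big] \\
&\qquad = \mathbb{E}_\mu\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \,\big|\, X_2^n \big],
\end{align*}

so that using the theorem of normal correlation, we obtain

\[ \langle C_{\mu_n}(X \mid X_2) h_1, h_2 \rangle = \mathbb{E}_\mu\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \big]. \tag{4.22} \]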

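Since the theorem of normal correlation also shows that

\begin{align*}
\langle C_\mu(X \mid X_2) h_1, h_2 \rangle
&:= \mathbb{E}_\mu\big[ \langle X - \mathbb{E}_\mu[X \mid X_2], h_1 \rangle \langle X - \mathbb{E}_\mu[X \mid X_2], h_2 \rangle \,\big|\, X_2 \big] \\
&= \mathbb{E}_\mu\big[ \langle X - \mathbb{E}_\mu[X \mid X_2], h_1 \rangle \langle X - \mathbb{E}_\mu[X \mid X_2], h_2 \rangle \big] \\
&= \mathbb{E}_\mu\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2], h_2 \rangle \big],
\end{align*}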


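the difference in the covariances can be decomposed as

\begin{align*}
&\langle C_{\mu_n}(X \mid X_2) h_1, h_2 \rangle - \langle C_\mu(X \mid X_2) h_1, h_2 \rangle \\
&\quad = \mathbb{E}_\mu\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \big] \\
&\qquad - \mathbb{E}_\mu\big[ \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2], h_1 \rangle \langle X_1 - \mathbb{E}_\mu[X_1 \mid X_2], h_2 \rangle \big] \\
&\quad = \mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2] - \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle X_1, h_2 \rangle \big]
 + \mathbb{E}_\mu\big[ \langle X_1, h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2] - \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \big] \\
&\qquad + \mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \big]
 - \mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2], h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2], h_2 \rangle \big],
\end{align*}

where the difference in the last line can be decomposed as

\begin{align*}
&\mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2^n], h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \big]
 - \mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2], h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2], h_2 \rangle \big] \\
&\quad = \mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2^n] - \mathbb{E}_\mu[X_1 \mid X_2], h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2^n], h_2 \rangle \big] \\
&\qquad + \mathbb{E}_\mu\big[ \langle \mathbb{E}_\mu[X_1 \mid X_2], h_1 \rangle \langle \mathbb{E}_\mu[X_1 \mid X_2^n] - \mathbb{E}_\mu[X_1 \mid X_2], h_2 \rangle \big].
\end{align*}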


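Then since conditional expectation is a contraction on $L^2(H, \mu, \mathcal{B})$ it follows that
$\|\mathbb{E}_\mu[X_1 \mid X_2]\|_{L^2(H, \mu, \mathcal{B})} \le \|X_1\|_{L^2(H, \mu, \mathcal{B})}$ and $\|\mathbb{E}_\mu[X_1 \mid X_2^n]\|_{L^2(H, \mu, \mathcal{B})} \le \|X_1\|_{L^2(H, \mu, \mathcal{B})}$
for all $n$. Moreover, since $\mathbb{E}_\mu[X_1 \mid X_2^n]$ converges to $\mathbb{E}_\mu[X_1 \mid X_2]$ in $L^2(H, \mu, \mathcal{B})$ it
follows, see e.g. [1, Lem. 11.45], that $\langle \mathbb{E}_\mu[X_1 \mid X_2^n], h \rangle$ converges to $\langle \mathbb{E}_\mu[X_1 \mid X_2], h \rangle$
in $L^2(\mathbb{R}, \mu, \mathcal{B})$ for all $h \in H$. Therefore, the Cauchy-Schwarz inequality applied four
times in the above decomposition implies that

\[ \lim_{n \to \infty} \langle C_{\mu_n}(X \mid X_2) h_1, h_2 \rangle = \langle C_\mu(X \mid X_2) h_1, h_2 \rangle, \qquad h_1, h_2 \in H, \]

so that we obtain

\[ C_\mu(X \mid X_2) = \omega\text{-}\lim_{n \to \infty} C_{\mu_n}(X \mid X_2). \]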
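
The four applications of the Cauchy-Schwarz inequality can be written out explicitly. The following is a sketch of that step; the shorthand $a$, $a_n$ and $\|\cdot\|_2$ is introduced here for brevity and is not notation from the text.

```latex
% Shorthand (introduced for this sketch only):
%   a   := E_mu[X_1 | X_2],   a_n := E_mu[X_1 | X_2^n],
%   ||.||_2 := norm of L^2(H, mu, B).
% Pointwise |<u(x), h>| <= ||u(x)||_H ||h||_H, so for the first term
\begin{align*}
\big| \mathbb{E}_\mu[ \langle a - a_n, h_1 \rangle \langle X_1, h_2 \rangle ] \big|
&\le \| \langle a - a_n, h_1 \rangle \|_{L^2(\mathbb{R}, \mu, \mathcal{B})}
     \| \langle X_1, h_2 \rangle \|_{L^2(\mathbb{R}, \mu, \mathcal{B})} \\
&\le \| a - a_n \|_2 \, \| X_1 \|_2 \, \| h_1 \|_H \| h_2 \|_H .
\end{align*}
% The remaining three terms are bounded in the same way, using
% ||a||_2 <= ||X_1||_2 and ||a_n||_2 <= ||X_1||_2, which gives
\[
\big| \langle C_{\mu_n}(X \mid X_2) h_1, h_2 \rangle - \langle C_\mu(X \mid X_2) h_1, h_2 \rangle \big|
\le 4 \, \| X_1 \|_2 \, \| a - a_n \|_2 \, \| h_1 \|_H \| h_2 \|_H .
\]
```

Since $\|a - a_n\|_2 \to 0$ by the martingale convergence established earlier, the right-hand side tends to zero for every fixed $h_1, h_2 \in H$.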

Since $C_n$ is compatible with $H_2$ for all $n$, the compatible case demonstrated in
(4.18) that

\[ C_{\mu_n}(X \mid X_2) = \mathcal{S}(C_n) \tag{4.23} \]

for all $n$. Since Theorem 3.1 asserts that

\[ \mathcal{S}(C) = \omega\text{-}\lim_{n \to \infty} \mathcal{S}(C_n), \]

we conclude that $C_\mu(X \mid X_2) = \mathcal{S}(C)$, establishing the assertion regarding the
covariance operators.

For the means, observe that since $\mu$ is a probability measure, it follows that
$X$ and therefore $X_1$ lie in the Lebesgue-Bochner space $L^1(H, \mu, \mathcal{B})$, and since by
Diestel and Uhl [12, Thm. V.1.4] the conditional expectation operators are also
contractions on $L^1(H, \mu, \mathcal{B})$ it also follows that $\mathbb{E}_{\mu_n}[X_1 \mid X_2^n]$ converges to $\mathbb{E}_\mu[X_1 \mid X_2]$
in $L^1(H, \mu, \mathcal{B})$. Therefore, Diestel and Uhl [12, Thm. V.2.8] implies that $\mathbb{E}_{\mu_n}[X_1 \mid X_2^n]$
converges to $\mathbb{E}_\mu[X_1 \mid X_2]$ a.e.-$\mu$. Let the conditional means $\mathbb{E}_\mu[X \mid X_2]$ be denoted
by $\mathbb{E}_\mu[X \mid X_2 = t] = m_t$, $t \in H_2$. Then, since

\[ P^n m = \big( m_1, P_{H_2^n} m_2 \big) \]

is the mean of the measure $\mu_n$, the assertion in the compatible case demonstrated
that the conditional means $\mathbb{E}_{\mu_n}[X \mid X_2 = t] = m_t^n$, $t \in H_2$ are

\[ m_t^n = m_1 + Q_n^* \big( t - P_{H_2^n} m_2 \big). \]

Since the statement that the conditional means $\mathbb{E}_{\mu_n}[X_1 \mid X_2^n]$ converge to the conditional means $\mathbb{E}_\mu[X_1 \mid X_2]$
a.e.-$\mu$ amounts to $m_t^n \to m_t$ for $\mu$-almost every $t$, the first assertion regarding the
means is also proved. Now suppose that $Q_n$ eventually becomes the special element
$Q_n = Q_{C_n, H_2}$ defined near (3.2). Then, by definition, $R(Q_n) \subset R(C^n_{22})$ so that
$\ker(Q_n^*) \supset R(C^n_{22})^\perp$, but since $C^n_{22} = \Pi_2 C_n \Pi_2^* = \Pi_2 P^n C P^n \Pi_2^* = \Pi_2 P_{H_2^n} C P_{H_2^n} \Pi_2^*$,
it follows that $R(C^n_{22}) \subset H_2^n$ and therefore $R(C^n_{22})^\perp \supset (H_2^n)^\perp$ so that $\ker(Q_n^*) \supset
(H_2^n)^\perp$. Therefore $Q_n^* P_{H_2^n} = Q_n^*$ so that the final assertion follows from the previous.


4.4 Proof of Corollary 3.4 

By Mourier’s Theorem, there exists a Gaussian measure $\mu$ on $H$ with mean $0$
and covariance operator $C := A$. Looking at the end of the proof of Theorem
3.3, since conditional expectation is a contraction on $L^2(H, \mu, \mathcal{B})$ it follows that
$\|\mathbb{E}_\mu[X_1 \mid X_2]\|_{L^2(H, \mu, \mathcal{B})} \le \|X_1\|_{L^2(H, \mu, \mathcal{B})}$ and $\|\mathbb{E}_\mu[X_1 \mid X_2^n]\|_{L^2(H, \mu, \mathcal{B})} \le \|X_1\|_{L^2(H, \mu, \mathcal{B})}$
for all $n$. Therefore, for $h \in H$, it follows from the Cauchy-Schwarz inequality
that $\|\langle \mathbb{E}_\mu[X_1 \mid X_2^n], h \rangle\|_{L^2(\mathbb{R}, \mu, \mathcal{B})} \le \|X_1\|_{L^2(H, \mu, \mathcal{B})}$ and $\|\langle \mathbb{E}_\mu[X_1 \mid X_2], h \rangle\|_{L^2(\mathbb{R}, \mu, \mathcal{B})} \le
\|X_1\|_{L^2(H, \mu, \mathcal{B})}$ for all $n$, uniformly for $h \in H$ with $\|h\|_H \le 1$. Therefore, the Cauchy-Schwarz
inequality applied four times in the decomposition at the end of the proof
of Theorem 3.3 implies that

\[ \lim_{n \to \infty} \langle C_{\mu_n}(X \mid X_2) h_1, h_2 \rangle = \langle C_\mu(X \mid X_2) h_1, h_2 \rangle, \qquad h_1, h_2 \in H, \]

uniformly for $h_1, h_2 \in H$ with $\|h_1\|_H \le 1$ and $\|h_2\|_H \le 1$. Therefore, it follows from
Halmos [19, Prob. 107] that the sequence of covariance operators converges

\[ C_{\mu_n}(X \mid X_2) \to C_\mu(X \mid X_2) \]

in the uniform operator topology.

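According to Maniglia and Rhandi [28, Ch. 1, Lem. 1.1.4] or Da Prato and
Zabczyk [10, Prop. 2.16], for a Gaussian measure $\mu$ with mean $0$ and covariance
operator $C$, we have

\[ \operatorname{tr}(C) = \mathbb{E}_\mu \|X\|^2. \]

From (4.22), by shifting to the center, we obtain that

\[ \operatorname{tr}\big(C_{\mu_n}(X \mid X_2)\big) = \mathbb{E}_\mu\big[ \|X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n]\|^2 \big] \]

and

\[ \operatorname{tr}\big(C_\mu(X \mid X_2)\big) = \mathbb{E}_\mu\big[ \|X_1 - \mathbb{E}_\mu[X_1 \mid X_2]\|^2 \big], \]

and therefore the difference is

\begin{align*}
&\operatorname{tr}\big(C_{\mu_n}(X \mid X_2)\big) - \operatorname{tr}\big(C_\mu(X \mid X_2)\big) \\
&\quad = \mathbb{E}_\mu\big[ \|X_1 - \mathbb{E}_\mu[X_1 \mid X_2^n]\|^2 \big] - \mathbb{E}_\mu\big[ \|X_1 - \mathbb{E}_\mu[X_1 \mid X_2]\|^2 \big] \\
&\quad = \mathbb{E}_\mu\big[ \big\langle \mathbb{E}_\mu[X_1 \mid X_2^n] - \mathbb{E}_\mu[X_1 \mid X_2], \; \mathbb{E}_\mu[X_1 \mid X_2^n] + \mathbb{E}_\mu[X_1 \mid X_2] - 2 X_1 \big\rangle \big].
\end{align*}
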
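Therefore, the Cauchy-Schwarz inequality, the $L^2$ convergence of $\mathbb{E}_\mu[X_1 \mid X_2^n]$ to
$\mathbb{E}_\mu[X_1 \mid X_2]$, and the uniform $L^2$ boundedness of $\mathbb{E}_\mu[X_1 \mid X_2^n]$, $\mathbb{E}_\mu[X_1 \mid X_2]$ and $X_1$,
imply that

\[ \lim_{n \to \infty} \operatorname{tr}\big(C_{\mu_n}(X \mid X_2)\big) = \operatorname{tr}\big(C_\mu(X \mid X_2)\big). \]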
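
The convergence of the traces can be made quantitative with the elementary identity $\|u\|^2 - \|v\|^2 = \langle u - v, u + v \rangle$. The sketch below spells this out; the shorthand $a$, $a_n$ and $\|\cdot\|_2$ is introduced here for brevity and is not notation from the text.

```latex
% Shorthand (introduced for this sketch only):
%   a   := E_mu[X_1 | X_2],   a_n := E_mu[X_1 | X_2^n],
%   ||.||_2 := norm of L^2(H, mu, B).
% Take u := X_1 - a_n and v := X_1 - a, so that
%   u - v = a - a_n   and   u + v = 2 X_1 - a_n - a.
\begin{align*}
\big| \operatorname{tr}\big(C_{\mu_n}(X \mid X_2)\big) - \operatorname{tr}\big(C_\mu(X \mid X_2)\big) \big|
&= \big| \mathbb{E}_\mu \langle a - a_n, \, 2 X_1 - a_n - a \rangle \big| \\
&\le \| a - a_n \|_2 \, \| 2 X_1 - a_n - a \|_2 \\
&\le \| a - a_n \|_2 \, \big( 2 \| X_1 \|_2 + \| a_n \|_2 + \| a \|_2 \big) \\
&\le 4 \, \| X_1 \|_2 \, \| a - a_n \|_2 .
\end{align*}
```

The last step uses the contraction bounds $\|a\|_2 \le \|X_1\|_2$ and $\|a_n\|_2 \le \|X_1\|_2$, and the right-hand side tends to zero by the $L^2$ convergence of $a_n$ to $a$.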

Since $C_{\mu_n}(X \mid X_2) \to C_\mu(X \mid X_2)$ in the uniform operator topology, it follows from
Kubrusly [24], see [23, Thm. 2 & Sec. 4], that $C_{\mu_n}(X \mid X_2) \to C_\mu(X \mid X_2)$ in the trace
norm topology. Since (4.23) asserts that $C_{\mu_n}(X \mid X_2) = \mathcal{S}(C_n)$ and Theorem 3.3
asserts that $C_\mu(X \mid X_2) = \mathcal{S}(C)$, the identification $A := C$ completes the proof.